SupWSD: a flexible toolkit for supervised word sense disambiguation
In this demonstration we present SupWSD, a Java API for supervised Word Sense Disambiguation (WSD). This toolkit includes the implementation of a state-of-the-art supervised WSD system, together with a Natural Language Processing pipeline for preprocessing and feature extraction. Our aim is to provide an easy-to-use tool for the research community, designed to be modular, fast and scalable for training and testing on large datasets. The source code of SupWSD is available at http://github.com/SI3P/SupWSD
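As a rough illustration of the preprocessing and feature-extraction stage such a pipeline performs before classification, the sketch below collects surrounding-word features for a target token. It is a generic supervised-WSD sketch in Python (SupWSD itself is a Java API); the window size and feature names are invented for illustration and do not reflect SupWSD's actual configuration.

```python
def extract_local_features(tokens, target_index, window=2):
    """Collect surrounding-word features for the token at target_index.

    Returns a dict mapping a relative-position feature name to the
    lowercased word at that offset, skipping the target itself.
    """
    features = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the target word itself is not a context feature
        pos = target_index + offset
        if 0 <= pos < len(tokens):
            features[f"word_{offset:+d}"] = tokens[pos].lower()
    return features

# Disambiguating "bank" in a short sentence:
tokens = ["The", "bank", "approved", "the", "loan"]
print(extract_local_features(tokens, 1))
# → {'word_-1': 'the', 'word_+1': 'approved', 'word_+2': 'the'}
```

A real pipeline would add lemma, part-of-speech, and syntactic features and feed the resulting vectors to a trained classifier, one per target lemma.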
Multilingual NMT with a language-independent attention bridge
In this paper, we propose a multilingual encoder-decoder architecture capable
of obtaining multilingual sentence representations by means of incorporating an
intermediate "attention bridge" that is shared across all languages. That
is, we train the model with language-specific encoders and decoders that are
connected via self-attention with a shared layer that we call attention bridge.
This layer exploits the semantics from each language for performing translation
and develops into a language-independent meaning representation that can
efficiently be used for transfer learning. We present a new framework for the
efficient development of multilingual NMT using this model and scheduled
training. We have tested the approach in a systematic way with a multi-parallel
data set. We show that the model achieves substantial improvements over strong
bilingual models and that it also works well for zero-shot translation, which
demonstrates its capacity for abstraction and transfer learning.
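The bridge described above can be sketched as fixed-size attention pooling: a small set of shared query vectors attends over the variable-length encoder states, so every sentence, in every language, is summarized by the same number of vectors. The plain-Python sketch below uses toy dot-product attention with made-up dimensions; the actual model uses learned multi-head self-attention inside a trained encoder-decoder network.

```python
import math

def attention_bridge(hidden_states, queries):
    """Pool a variable-length list of encoder states into a fixed number
    of vectors, one per shared query, via scaled dot-product attention.

    hidden_states: list of d-dimensional vectors (one per source token)
    queries:       list of d-dimensional vectors (the shared bridge)
    """
    d = len(hidden_states[0])
    pooled = []
    for q in queries:
        # Scaled dot-product scores of this query against every state.
        scores = [sum(qi * hi for qi, hi in zip(q, h)) / math.sqrt(d)
                  for h in hidden_states]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of the states: one pooled vector per query.
        pooled.append([sum(w * h[j] for w, h in zip(weights, hidden_states))
                       for j in range(d)])
    return pooled

states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3-token sentence
bridge = [[1.0, 0.0], [0.0, 1.0]]                # 2 shared queries
print(len(attention_bridge(states, bridge)))     # → 2, regardless of length
```

Because the output size depends only on the number of queries, the pooled matrix can be handed to any decoder (or downstream classifier) without knowing which language produced it.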
Entity Linking meets Word Sense Disambiguation: A Unified Approach
Entity Linking (EL) and Word Sense Disambiguation (WSD) both address the lexical ambiguity
of language. But while the two tasks are quite similar, they differ in a fundamental respect: in EL the textual mention can be linked to a named entity which may or may not contain the exact mention, while in WSD there is a perfect match between the word form (better, its lemma) and a suitable word sense. In this paper we present Babelfy, a unified graph-based approach to EL and WSD based on a loose identification of candidate meanings coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations. Our experiments show state-of-the-art performance on both tasks on 6 different datasets, including a multilingual setting. Babelfy is online at http://babelfy.org
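The densest-subgraph step can be illustrated with the standard greedy peeling algorithm: repeatedly remove the minimum-degree node and keep the intermediate subgraph with the highest average degree. This is a generic 2-approximation sketch over an invented toy graph, not Babelfy's exact heuristic, whose candidate graph and scoring differ in detail.

```python
def densest_subgraph(nodes, edges):
    """Greedy peeling for the densest subgraph (density = edges / nodes).

    Repeatedly deletes the minimum-degree node; returns the intermediate
    node set with the highest density seen, plus that density.
    """
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(nodes)
    edge_count = len(edges)
    best_density, best_set = -1.0, set(remaining)
    while remaining:
        density = edge_count / len(remaining)
        if density >= best_density:  # prefer later (smaller) sets on ties
            best_density, best_set = density, set(remaining)
        # Peel the node with the fewest remaining neighbors.
        v = min(remaining, key=lambda x: len(adj[x]))
        edge_count -= len(adj[v])
        for u in adj[v]:
            adj[u].discard(v)
        remaining.discard(v)
        adj[v] = set()
    return best_set, best_density

# Toy graph: a triangle a-b-c with a pendant node d hanging off a.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
print(densest_subgraph(nodes, edges)[0])  # → {'a', 'b', 'c'}
```

In the disambiguation setting, nodes would be candidate meanings and edges semantic relations between them; peeling away weakly connected candidates leaves a high-coherence core of interpretations.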
New frontiers in supervised word sense disambiguation: building multilingual resources and neural models on a large scale
Word Sense Disambiguation is a long-standing task in Natural Language Processing
(NLP), lying at the core of human language understanding. While it has already
been studied from many different angles over the years, ranging from knowledge
based systems to semi-supervised and fully supervised models, the field seems to
be slowing down relative to other NLP tasks, e.g., part-of-speech tagging and
dependency parsing. Despite the organization of several international competitions
aimed at evaluating Word Sense Disambiguation systems, the evaluation of automatic
systems has been problematic, mainly due to the lack of a reliable evaluation
framework enabling a direct quantitative comparison.
To this end we develop a unified evaluation framework and analyze the performance
of various Word Sense Disambiguation systems in a fair setup. The results
show that supervised systems clearly outperform knowledge-based models. Among
the supervised systems, a linear classifier trained on conventional local features
still proves to be a hard baseline to beat. Nonetheless, recent approaches exploiting
neural networks on unlabeled corpora achieve promising results, surpassing this
hard baseline in most test sets. Even though supervised systems tend to perform
best in terms of accuracy, they often lose ground to more flexible knowledge-based
solutions, which do not require training for every disambiguation target. To bridge
this gap we adopt a different perspective and rely on sequence learning to frame
the disambiguation problem: we propose and study in depth a series of end-to-end
neural architectures directly tailored to the task, from bidirectional Long Short-Term
Memory to encoder-decoder models. Our extensive evaluation over standard
benchmarks and in multiple languages shows that sequence learning enables more
versatile all-words models that consistently lead to state-of-the-art results, even
against models trained with engineered features.
However, supervised systems need annotated training corpora and the few available
to date are of limited size: this is mainly due to the expensive and time-consuming
process of annotating a wide variety of word senses at a reasonably high
scale, i.e., the so-called knowledge acquisition bottleneck. To address this issue, we
also present different strategies to automatically acquire high-quality sense-annotated
data in multiple languages, without any manual effort. We assess the quality of the
sense annotations both intrinsically and extrinsically, achieving competitive results
on multiple tasks.
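The sequence-learning framing described above can be sketched as mapping each input token to an output tag that is either a sense key (for annotated content words) or the token itself (for everything else), so a tagger or encoder-decoder model can emit senses in context. The sense keys and the function below are invented toy data preparation, not the thesis's actual architecture.

```python
def to_sequence_example(tokens, sense_annotations):
    """Frame all-words WSD as sequence labeling.

    tokens:            list of surface words
    sense_annotations: dict mapping token index -> sense key

    Returns (input sequence, tag sequence): annotated positions get their
    sense key, all other positions fall back to the lowercased token.
    """
    tags = [sense_annotations.get(i, tok.lower())
            for i, tok in enumerate(tokens)]
    return tokens, tags

tokens = ["She", "sat", "by", "the", "bank"]
senses = {1: "sit%2:35:00", 4: "bank%1:17:01"}  # illustrative sense keys
print(to_sequence_example(tokens, senses)[1])
# → ['she', 'sit%2:35:00', 'by', 'the', 'bank%1:17:01']
```

A single model trained on such pairs disambiguates all words jointly in one pass, rather than needing a separate classifier per target lemma.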
Personalization in BERT with Adapter Modules and Topic Modelling
As a result of the widespread use of intelligent assistants, personalization in dialogue systems has become
a hot topic in both research and industry. Typically, training such systems is computationally expensive,
especially when using recent large language models. To address this challenge, we develop an approach
to personalize dialogue systems using adapter layers and topic modelling. Our implementation enables
the model to incorporate user-specific information, achieving promising results by training only a small
fraction of parameters.
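Adapter layers of the kind mentioned above are small bottleneck modules inserted into a frozen backbone: a down-projection, a nonlinearity, an up-projection, and a residual connection, so only the tiny projection matrices are trained. Below is a minimal pure-Python forward pass with made-up dimensions; it is a generic adapter sketch, not the paper's exact implementation.

```python
def adapter_forward(x, w_down, w_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add.

    x:      input vector of length d (a frozen backbone activation)
    w_down: d x r down-projection matrix (trainable, r << d)
    w_up:   r x d up-projection matrix (trainable)
    """
    d, r = len(x), len(w_down[0])
    # Down-project into the small bottleneck, then apply ReLU.
    h = [max(0.0, sum(x[i] * w_down[i][j] for i in range(d)))
         for j in range(r)]
    # Up-project back to d dimensions and add the residual input.
    return [x[k] + sum(h[j] * w_up[j][k] for j in range(r))
            for k in range(d)]

# Toy example: d = 2, bottleneck r = 1.
x = [1.0, 2.0]
w_down = [[0.5], [0.5]]
w_up = [[0.1, -0.1]]
print(adapter_forward(x, w_down, w_up))
```

If the up-projection starts at zero, the adapter is initially an identity function, which is why inserting it does not disturb the pretrained model before fine-tuning; personalization then only needs to store the small per-user matrices.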
A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation
Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate cross-lingual shared layer, which we refer to as attention bridge. This layer exploits the semantics from each language and develops into a language-agnostic meaning representation that can be efficiently used for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also improve accuracy on trainable classification tasks. Nevertheless, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance in non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we also include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information that is captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
Peer reviewed
Recent Trends in Word Sense Disambiguation : A Survey
Survey Track.
A Large-Scale Multilingual Disambiguation of Glosses
Linking concepts and named entities to knowledge bases has become a crucial Natural Language Understanding task. In this respect, recent works have shown the key advantage of exploiting textual definitions in various Natural Language Processing applications. However, to date there are no reliable large-scale corpora of sense-annotated textual definitions available to the research community. In this paper we present a large-scale high-quality corpus of disambiguated glosses in multiple languages, comprising sense annotations of
both concepts and named entities from a unified sense inventory. Our approach for the construction and disambiguation of the corpus builds upon the structure of a large multilingual semantic network and a state-of-the-art disambiguation system; first, we gather complementary information of equivalent definitions across different languages to provide context for disambiguation, and then we combine it with a semantic similarity-based refinement. As a result we obtain a multilingual corpus of textual definitions featuring over 38 million definitions in 263 languages, and we make it freely available at http://lcl.uniroma1.it/disambiguated-glosses. Experiments on Open Information Extraction and Sense Clustering show how two state-of-the-art approaches improve their performance by integrating our disambiguated corpus into their pipeline.
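The semantic similarity-based refinement can be illustrated as ranking candidate senses by cosine similarity between a context vector and each candidate's vector, keeping the best-matching sense. The vectors and sense labels below are invented toy data; the actual system's representations and scoring are more elaborate.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def refine(context_vec, candidates):
    """Pick the candidate sense whose vector best matches the context.

    candidates: dict mapping a sense label to its vector.
    """
    return max(candidates, key=lambda sense: cosine(context_vec, candidates[sense]))

# Toy 2-d vectors: the context of a gloss about rivers points toward
# the "river bank" candidate rather than the financial one.
context = [1.0, 0.0]
candidates = {"bank_river": [0.9, 0.1], "bank_finance": [0.0, 1.0]}
print(refine(context, candidates))  # → bank_river
```

In the corpus construction described above, such a similarity check serves as a second pass that filters or corrects the initial graph-based annotations.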